Cloud computing has gained interest among commercial organizations, research communities, developers, and other individuals during the past few years. To move ahead with research in the field of data management and the processing of such data, we need benchmark datasets that are freely available and publicly accessible. In May 2011, Google released a trace of a cluster of 11k machines, referred to as the Google Cluster Trace. This trace contains cell information covering about 29 days. This paper provides an analysis of resource usage and requirements in this trace and is an attempt to give insight into this kind of production trace, which is similar to those found in cloud environments. The major contributions of this paper include a statistical profile of jobs based on resource usage, clustering of workload patterns, and classification of jobs into different types based on k-means clustering. Although there have been earlier analyses of this trace, our analysis provides several new findings, such as that jobs in a production trace are trimodal and that there is symmetry in the tasks within a long job type.